Skip to content

Conversation

@gandhi56
Copy link
Contributor

@gandhi56 gandhi56 commented Sep 19, 2025

Introduce add/sub support for S64 and V2S16 types. Additionally, add rules for G_UADDO, G_USUBO, G_UADDE and G_USUBE as they are needed for S64 addition/subtraction.

This comment was marked as resolved.

@llvmbot
Copy link
Member

llvmbot commented Sep 19, 2025

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-amdgpu

Author: Anshil Gandhi (gandhi56)

Changes

This patch adds support for S64 types and updates tests accordingly.


Full diff: https://github.com/llvm/llvm-project/pull/159860.diff

4 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp (+3-1)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.s16.mir (+6-11)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.s64.mir (+76)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.v2s16.mir (+6-16)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
index 0776d14a84067..ffc6cd977371b 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
@@ -470,7 +470,9 @@ RegBankLegalizeRules::RegBankLegalizeRules(const GCNSubtarget &_ST,
       .Uni(S16, {{Sgpr32Trunc}, {Sgpr32AExt, Sgpr32AExt}})
       .Div(S16, {{Vgpr16}, {Vgpr16, Vgpr16}})
       .Uni(S32, {{Sgpr32}, {Sgpr32, Sgpr32}})
-      .Div(S32, {{Vgpr32}, {Vgpr32, Vgpr32}});
+      .Div(S32, {{Vgpr32}, {Vgpr32, Vgpr32}})
+      .Uni(S64, {{Sgpr64}, {Sgpr64, Sgpr32}})
+      .Div(S64, {{Vgpr64}, {Vgpr64, Vgpr32}});
 
   addRulesForGOpcs({G_MUL}, Standard).Div(S32, {{Vgpr32}, {Vgpr32, Vgpr32}});
 
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.s16.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.s16.mir
index 54ee69fcb2204..6e16b2aeea0f8 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.s16.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.s16.mir
@@ -1,6 +1,6 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=amdgcn -mcpu=fiji -run-pass=regbankselect %s -verify-machineinstrs -o - -regbankselect-fast | FileCheck %s
-# RUN: llc -mtriple=amdgcn -mcpu=fiji -run-pass=regbankselect %s -verify-machineinstrs -o - -regbankselect-greedy | FileCheck %s
+# RUN: llc -mtriple=amdgcn -mcpu=fiji -run-pass=amdgpu-regbankselect %s -verify-machineinstrs -o - -regbankselect-fast | FileCheck %s
+# RUN: llc -mtriple=amdgcn -mcpu=fiji -run-pass=amdgpu-regbankselect %s -verify-machineinstrs -o - -regbankselect-greedy | FileCheck %s
 ---
 name: add_s16_ss
 legalized: true
@@ -15,11 +15,8 @@ body: |
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1
     ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32)
     ; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32)
-    ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:sgpr(s32) = G_ANYEXT [[TRUNC]](s16)
-    ; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:sgpr(s32) = G_ANYEXT [[TRUNC1]](s16)
-    ; CHECK-NEXT: [[ADD:%[0-9]+]]:sgpr(s32) = G_ADD [[ANYEXT]], [[ANYEXT1]]
-    ; CHECK-NEXT: [[TRUNC2:%[0-9]+]]:sgpr(s16) = G_TRUNC [[ADD]](s32)
-    ; CHECK-NEXT: S_ENDPGM 0, implicit [[TRUNC2]](s16)
+    ; CHECK-NEXT: [[ADD:%[0-9]+]]:sgpr(s16) = G_ADD [[TRUNC]], [[TRUNC1]]
+    ; CHECK-NEXT: S_ENDPGM 0, implicit [[ADD]](s16)
     %0:_(s32) = COPY $sgpr0
     %1:_(s32) = COPY $sgpr1
     %2:_(s16) = G_TRUNC %0
@@ -42,8 +39,7 @@ body: |
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0
     ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32)
     ; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY1]](s32)
-    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:vgpr(s16) = COPY [[TRUNC]](s16)
-    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(s16) = G_ADD [[COPY2]], [[TRUNC1]]
+    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(s16) = G_ADD [[TRUNC]], [[TRUNC1]]
     ; CHECK-NEXT: S_ENDPGM 0, implicit [[ADD]](s16)
     %0:_(s32) = COPY $sgpr0
     %1:_(s32) = COPY $vgpr0
@@ -67,8 +63,7 @@ body: |
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
     ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32)
     ; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32)
-    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:vgpr(s16) = COPY [[TRUNC1]](s16)
-    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(s16) = G_ADD [[TRUNC]], [[COPY2]]
+    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(s16) = G_ADD [[TRUNC]], [[TRUNC1]]
     ; CHECK-NEXT: S_ENDPGM 0, implicit [[ADD]](s16)
     %0:_(s32) = COPY $vgpr0
     %1:_(s32) = COPY $sgpr0
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.s64.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.s64.mir
new file mode 100644
index 0000000000000..eb93868ccc359
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.s64.mir
@@ -0,0 +1,76 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1200 -run-pass=amdgpu-regbankselect %s -o - | FileCheck %s
+
+---
+name: add_s64_ss
+legalized: true
+
+body: |
+  bb.0:
+    liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
+    ; CHECK-LABEL: name: add_s64_ss
+    ; CHECK: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:sgpr(s64) = COPY $sgpr2_sgpr3
+    ; CHECK-NEXT: [[ADD:%[0-9]+]]:sgpr(s64) = G_ADD [[COPY]], [[COPY1]]
+    %0:_(s64) = COPY $sgpr0_sgpr1
+    %1:_(s64) = COPY $sgpr2_sgpr3
+    %2:_(s64) = G_ADD %0, %1
+...
+
+---
+name: add_s64_sv
+legalized: true
+
+body: |
+  bb.0:
+    liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; CHECK-LABEL: name: add_s64_sv
+    ; CHECK: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr(s64) = COPY $vgpr0_vgpr1
+    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(s64) = G_ADD [[COPY]], [[COPY1]]
+    %0:_(s64) = COPY $sgpr0_sgpr1
+    %1:_(s64) = COPY $vgpr0_vgpr1
+    %2:_(s64) = G_ADD %0, %1
+...
+
+---
+name: add_s64_vs
+legalized: true
+
+body: |
+  bb.0:
+    liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; CHECK-LABEL: name: add_s64_vs
+    ; CHECK: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:vgpr(s64) = COPY $vgpr0_vgpr1
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:sgpr(s64) = COPY $sgpr0_sgpr1
+    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(s64) = G_ADD [[COPY]], [[COPY1]]
+    %0:_(s64) = COPY $vgpr0_vgpr1
+    %1:_(s64) = COPY $sgpr0_sgpr1
+    %2:_(s64) = G_ADD %0, %1
+...
+
+---
+name: add_s64_vv
+legalized: true
+
+body: |
+  bb.0:
+    liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
+    ; CHECK-LABEL: name: add_s64_vv
+    ; CHECK: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:vgpr(s64) = COPY $vgpr0_vgpr1
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr(s64) = COPY $vgpr2_vgpr3
+    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(s64) = G_ADD [[COPY]], [[COPY1]]
+    %0:_(s64) = COPY $vgpr0_vgpr1
+    %1:_(s64) = COPY $vgpr2_vgpr3
+    %2:_(s64) = G_ADD %0, %1
+...
+
+
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.v2s16.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.v2s16.mir
index 97018fac13a87..cc278503d6965 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.v2s16.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-add.v2s16.mir
@@ -1,6 +1,6 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=amdgcn -mcpu=gfx900 -run-pass=regbankselect %s -verify-machineinstrs -o - -regbankselect-fast | FileCheck %s
-# RUN: llc -mtriple=amdgcn -mcpu=gfx900 -run-pass=regbankselect %s -verify-machineinstrs -o - -regbankselect-greedy | FileCheck %s
+# RUN: llc -mtriple=amdgcn -mcpu=gfx900 -run-pass=amdgpu-regbankselect %s -verify-machineinstrs -o - -regbankselect-fast | FileCheck %s
+# RUN: llc -mtriple=amdgcn -mcpu=gfx900 -run-pass=amdgpu-regbankselect %s -verify-machineinstrs -o - -regbankselect-greedy | FileCheck %s
 
 ---
 name: add_v2s16_ss
@@ -14,16 +14,8 @@ body: |
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr1
-    ; CHECK-NEXT: [[BITCAST:%[0-9]+]]:sgpr(s32) = G_BITCAST [[COPY]](<2 x s16>)
-    ; CHECK-NEXT: [[C:%[0-9]+]]:sgpr(s32) = G_CONSTANT i32 16
-    ; CHECK-NEXT: [[LSHR:%[0-9]+]]:sgpr(s32) = G_LSHR [[BITCAST]], [[C]](s32)
-    ; CHECK-NEXT: [[BITCAST1:%[0-9]+]]:sgpr(s32) = G_BITCAST [[COPY1]](<2 x s16>)
-    ; CHECK-NEXT: [[C1:%[0-9]+]]:sgpr(s32) = G_CONSTANT i32 16
-    ; CHECK-NEXT: [[LSHR1:%[0-9]+]]:sgpr(s32) = G_LSHR [[BITCAST1]], [[C1]](s32)
-    ; CHECK-NEXT: [[ADD:%[0-9]+]]:sgpr(s32) = G_ADD [[BITCAST]], [[BITCAST1]]
-    ; CHECK-NEXT: [[ADD1:%[0-9]+]]:sgpr(s32) = G_ADD [[LSHR]], [[LSHR1]]
-    ; CHECK-NEXT: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:sgpr(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[ADD]](s32), [[ADD1]](s32)
-    ; CHECK-NEXT: S_ENDPGM 0, implicit [[BUILD_VECTOR_TRUNC]](<2 x s16>)
+    ; CHECK-NEXT: [[ADD:%[0-9]+]]:sgpr(<2 x s16>) = G_ADD [[COPY]], [[COPY1]]
+    ; CHECK-NEXT: S_ENDPGM 0, implicit [[ADD]](<2 x s16>)
     %0:_(<2 x s16>) = COPY $sgpr0
     %1:_(<2 x s16>) = COPY $sgpr1
     %2:_(<2 x s16>) = G_ADD %0, %1
@@ -42,8 +34,7 @@ body: |
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr(<2 x s16>) = COPY $vgpr0
-    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:vgpr(<2 x s16>) = COPY [[COPY]](<2 x s16>)
-    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(<2 x s16>) = G_ADD [[COPY2]], [[COPY1]]
+    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(<2 x s16>) = G_ADD [[COPY]], [[COPY1]]
     ; CHECK-NEXT: S_ENDPGM 0, implicit [[ADD]](<2 x s16>)
     %0:_(<2 x s16>) = COPY $sgpr0
     %1:_(<2 x s16>) = COPY $vgpr0
@@ -63,8 +54,7 @@ body: |
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:vgpr(<2 x s16>) = COPY $vgpr0
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:sgpr(<2 x s16>) = COPY $sgpr0
-    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:vgpr(<2 x s16>) = COPY [[COPY1]](<2 x s16>)
-    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(<2 x s16>) = G_ADD [[COPY]], [[COPY2]]
+    ; CHECK-NEXT: [[ADD:%[0-9]+]]:vgpr(<2 x s16>) = G_ADD [[COPY]], [[COPY1]]
     %0:_(<2 x s16>) = COPY $vgpr0
     %1:_(<2 x s16>) = COPY $sgpr0
     %2:_(<2 x s16>) = G_ADD %0, %1

@gandhi56 gandhi56 marked this pull request as draft September 22, 2025 15:54
@gandhi56 gandhi56 marked this pull request as ready for review September 24, 2025 17:38
@gandhi56 gandhi56 changed the title [AMDGPU] Add regbankselect rules for G_ADD/SUB [AMDGPU] Add regbankselect rules for G_ADD/SUB and variants Sep 25, 2025
@gandhi56 gandhi56 marked this pull request as draft October 9, 2025 14:12
@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch 2 times, most recently from 001179f to afb2207 Compare October 10, 2025 20:00
@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch from afb2207 to e5a5563 Compare October 19, 2025 17:40
@gandhi56 gandhi56 marked this pull request as ready for review October 19, 2025 17:51
@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch from e5a5563 to 3cceb3b Compare October 19, 2025 18:01
@gandhi56
Copy link
Contributor Author

Rebased

@github-actions
Copy link

github-actions bot commented Oct 19, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch from 3cceb3b to b227afa Compare October 19, 2025 21:30
@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch from e2a7645 to 67a5e6c Compare October 21, 2025 14:49
@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch from 67a5e6c to 923d749 Compare October 21, 2025 15:18
@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch 2 times, most recently from 5a59602 to 29d91cf Compare October 27, 2025 18:08
Introduce add/sub support for S64 and V2S16 types. Additionally,
add rules for G_UADDO, G_USUBO, G_UADDE and G_USUBE as they are
needed for S64 addition/subtraction.
@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch from 6eeb2ab to eccca2d Compare October 29, 2025 14:58
@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch from eccca2d to 4f49a0d Compare October 29, 2025 15:10
@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch from 4f49a0d to 57f328e Compare October 29, 2025 19:37
@gandhi56 gandhi56 force-pushed the globalisel/g_addsub branch from 57f328e to bd49109 Compare October 30, 2025 14:56
@gandhi56 gandhi56 enabled auto-merge (squash) October 30, 2025 15:43
@gandhi56 gandhi56 disabled auto-merge October 30, 2025 15:44
@gandhi56 gandhi56 merged commit b1d5a2a into llvm:main Oct 30, 2025
10 checks passed
@gandhi56 gandhi56 deleted the globalisel/g_addsub branch October 30, 2025 15:45
aokblast pushed a commit to aokblast/llvm-project that referenced this pull request Oct 30, 2025
)

Add legalization rules for G_ADD, G_UADDO, G_UADDE and their SUB counterparts.
luciechoi pushed a commit to luciechoi/llvm-project that referenced this pull request Nov 1, 2025
)

Add legalization rules for G_ADD, G_UADDO, G_UADDE and their SUB counterparts.
DEBADRIBASAK pushed a commit to DEBADRIBASAK/llvm-project that referenced this pull request Nov 3, 2025
)

Add legalization rules for G_ADD, G_UADDO, G_UADDE and their SUB counterparts.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants